Here, we set a few knitr chunk options and the default ggplot2 theme.
knitr::opts_chunk$set(
warning = TRUE, # show warnings during codebook generation
message = TRUE, # show messages during codebook generation
error = TRUE, # do not interrupt codebook generation in case of errors,
# usually better for debugging
echo = TRUE # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())
## Warning: replacing previous import 'vctrs::data_frame' by 'tibble::data_frame'
## when loading 'dplyr'
Now, we’re preparing our data for the codebook.
library(codebook)
webshot::install_phantomjs()
## It seems that the version of `phantomjs` installed is greater than or equal to the requested version.To install the requested version or downgrade to another version, use `force = TRUE`.
library(labelled)
##
## Attaching package: 'labelled'
## The following object is masked from 'package:codebook':
##
## to_factor
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# codebook_data <- codebook::bfi
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
codebook_data <- rio::import("peril_reliability_deid.csv")
codebook_dictionary <- rio::import("peril_reliability_deid_codebook.csv")
var_label(codebook_data) <- codebook_dictionary %>% select(variable, label) %>% dict_to_list()
metadata(codebook_data)$name <- 'Reliability Dataset Codebook'
metadata(codebook_data)$description <- "Reliability data associated with paper 'Dangerous ground: One-year-old infants are sensitive to peril in other agents’ action plans'"
metadata(codebook_data)$creator <- "Shari Liu"
metadata(codebook_data)$datePublished <- "2022-04-12"
# omit the following lines, if your missing values are already properly labelled
# codebook_data <- detect_missing(codebook_data,
# only_labelled = TRUE, # only labelled values are autodetected as
# # missing
# negative_values_are_missing = FALSE, # whether negative values are treated as missing values
# ninety_nine_problems = TRUE, # 99/999 are missing values, if they
# # are more than 5 MAD from the median
# )
# If you are not using formr, the codebook package needs to guess which items
# form a scale. The following line finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
# codebook_data <- detect_scales(codebook_data)
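The labelling step above pipes a two-column dictionary through `dict_to_list()` and assigns the result with `var_label<-`. A minimal base-R sketch of the same idea follows, assuming the dictionary has `variable` and `label` columns as in the `select()` call. The toy `dictionary` and `toy_data` objects and the helper `labels_from_dictionary()` are hypothetical stand-ins for the CSV files, not part of the codebook or labelled packages.

```r
# Hypothetical stand-in for the dictionary CSV: one row per variable.
dictionary <- data.frame(
  variable = c("subj", "trialn"),
  label = c("de-identified subject id", "index of trial"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the data set.
toy_data <- data.frame(subj = c("a1", "a2"), trialn = c(1, 2))

# Turn the dictionary into a named list: names are variables, values are labels.
labels_from_dictionary <- function(dict) {
  stats::setNames(as.list(dict$label), dict$variable)
}

label_list <- labels_from_dictionary(dictionary)

# Attach each label as a "label" attribute, skipping names absent from the data.
for (v in intersect(names(label_list), names(toy_data))) {
  attr(toy_data[[v]], "label") <- label_list[[v]]
}

print(attr(toy_data$trialn, "label"))
```

Restricting the loop to `intersect()` of the two name sets is a design choice for this sketch: a dictionary row without a matching column is ignored rather than raising an error.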
Create codebook
skim_codebook(codebook_data)
| Name | data |
|---|---|
| Number of rows | 408 |
| Number of columns | 9 |
| Column type frequency: | |
| character | 6 |
| numeric | 3 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| subj | 0 | 1 | 1 | 6 | 0 | 102 | 0 |
| experiment.orig | 0 | 1 | 3 | 5 | 0 | 6 | 0 |
| experiment.paper | 0 | 1 | 4 | 11 | 0 | 6 | 0 |
| experiment.new | 0 | 1 | 12 | 19 | 0 | 6 | 0 |
| trial | 0 | 1 | 5 | 5 | 0 | 4 | 0 |
| coder | 0 | 1 | 5 | 7 | 0 | 3 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | min | median | max | hist |
|---|---|---|---|---|---|---|---|---|
| orig.look | 16 | 0.96 | 20.66 | 16.25 | 0.95 | 14.59 | 60 | ▇▆▂▂▂ |
| trialn | 0 | 1.00 | 2.50 | 1.12 | 1.00 | 2.50 | 4 | ▇▇▁▇▇ |
| secondary.look | 12 | 0.97 | 20.76 | 16.39 | 0.95 | 14.59 | 60 | ▇▅▂▂▂ |
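As a quick check on the `complete_rate` column: it is the share of non-missing values, so with 408 rows it equals one minus the missing count divided by the number of rows. The sketch below recomputes it from the counts reported in the table; the object names `n_rows` and `n_missing` are chosen here for illustration only.

```r
# Values taken from the skim output above.
n_rows <- 408
n_missing <- c(orig.look = 16, trialn = 0, secondary.look = 12)

# complete_rate is the proportion of non-missing values per column.
complete_rate <- 1 - n_missing / n_rows

print(round(complete_rate, 2))
```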
codebook(codebook_data)
Dataset name: Reliability Dataset Codebook
Reliability data associated with paper ‘Dangerous ground: One-year-old infants are sensitive to peril in other agents’ action plans’
Metadata for search engines
Date published: 2022-04-12
Creator:
| name | value |
|---|---|
| 1 | Shari Liu |
Variables
de-identified subject id
Distribution of values for subj
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| subj | de-identified subject id | character | 0 | 1 | 102 | 0 | 1 | 6 | 0 |
original name of experiment
Distribution of values for experiment.orig
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| experiment.orig | original name of experiment | character | 0 | 1 | 6 | 0 | 3 | 5 | 0 |
name of experiment used in paper
Distribution of values for experiment.paper
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| experiment.paper | name of experiment used in paper | character | 0 | 1 | 6 | 0 | 4 | 11 | 0 |
specific name of sample to distinguish between older and younger infants
Distribution of values for experiment.new
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| experiment.new | specific name of sample to distinguish between older and younger infants | character | 0 | 1 | 6 | 0 | 12 | 19 | 0 |
which test trial (test1-test4)
Distribution of values for trial
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| trial | which test trial (test1-test4) | character | 0 | 1 | 4 | 0 | 5 | 5 | 0 |
looking time generated by original offline coding (used in analysis)
Distribution of values for orig.look
16 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| orig.look | looking time generated by original offline coding (used in analysis) | numeric | 16 | 0.9607843 | 0.95 | 15 | 60 | 20.66207 | 16.25162 | ▇▆▂▂▂ |
index of trial
Distribution of values for trialn
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| trialn | index of trial | numeric | 0 | 1 | 1 | 2.5 | 4 | 2.5 | 1.119407 | ▇▇▁▇▇ |
looking time generated by second coder
Distribution of values for secondary.look
12 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| secondary.look | looking time generated by second coder | numeric | 12 | 0.9705882 | 0.95 | 15 | 60 | 20.75837 | 16.38608 | ▇▅▂▂▂ |
who did the secondary coding
Distribution of values for coder
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| coder | who did the secondary coding | character | 0 | 1 | 3 | 0 | 5 | 7 | 0 |
JSON-LD metadata
The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.
{
"name": "Reliability Dataset Codebook",
"description": "Reliability data associated with paper 'Dangerous ground: One-year-old infants are sensitive to peril in other agents’ action plans'\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name |label | n_missing|\n|:----------------|:------------------------------------------------------------------------|---------:|\n|subj |de-identified subject id | 0|\n|experiment.orig |original name of experiment | 0|\n|experiment.paper |name of experiment used in paper | 0|\n|experiment.new |specific name of sample to distinguish between older and younger infants | 0|\n|trial |which test trial (test1-test4) | 0|\n|orig.look |looking time generated by original offline coding (used in analysis) | 16|\n|trialn |index of trial | 0|\n|secondary.look |looking time generated by second coder | 12|\n|coder |who did the secondary coding | 0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
"creator": "Shari Liu",
"datePublished": "2022-04-12",
"keywords": ["subj", "experiment.orig", "experiment.paper", "experiment.new", "trial", "orig.look", "trialn", "secondary.look", "coder"],
"@context": "http://schema.org/",
"@type": "Dataset",
"variableMeasured": [
{
"name": "subj",
"description": "de-identified subject id",
"@type": "propertyValue"
},
{
"name": "experiment.orig",
"description": "original name of experiment",
"@type": "propertyValue"
},
{
"name": "experiment.paper",
"description": "name of experiment used in paper",
"@type": "propertyValue"
},
{
"name": "experiment.new",
"description": "specific name of sample to distinguish between older and younger infants",
"@type": "propertyValue"
},
{
"name": "trial",
"description": "which test trial (test1-test4)",
"@type": "propertyValue"
},
{
"name": "orig.look",
"description": "looking time generated by original offline coding (used in analysis)",
"@type": "propertyValue"
},
{
"name": "trialn",
"description": "index of trial",
"@type": "propertyValue"
},
{
"name": "secondary.look",
"description": "looking time generated by second coder",
"@type": "propertyValue"
},
{
"name": "coder",
"description": "who did the secondary coding",
"@type": "propertyValue"
}
]
}
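To see how the `variableMeasured` entries relate to the variable labels, here is a minimal sketch that builds the same structure from a named list of labels. It is not the codebook package's implementation: the helper `variable_measured()` is hypothetical, only two of the nine variables are included, and serialisation uses `jsonlite::toJSON()` only if that package happens to be installed.

```r
# Labels for two of the variables documented above.
labels <- list(
  subj = "de-identified subject id",
  trialn = "index of trial"
)

# Hypothetical helper: one entry per variable, mirroring the fields shown above.
variable_measured <- function(labels) {
  unname(Map(
    function(name, description) {
      list(name = name, description = description, `@type` = "propertyValue")
    },
    names(labels), labels
  ))
}

json_ld <- list(
  `@context` = "http://schema.org/",
  `@type` = "Dataset",
  variableMeasured = variable_measured(labels)
)

# Serialise only if jsonlite is available (an assumption about the local setup).
if (requireNamespace("jsonlite", quietly = TRUE)) {
  cat(jsonlite::toJSON(json_ld, auto_unbox = TRUE, pretty = TRUE), "\n")
}
```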